INFORMATION EXTRACTION SYSTEM, INFORMATION PROCESSING
APPARATUS, INFORMATION COLLECTION APPARATUS, CHARACTER STRING
EXTRACTION METHOD, AND STORAGE MEDIUM

Field of the Invention

The present invention relates to an information processing method
for monitoring the manipulation by a user of data on the screen of a
computer display device, and for obtaining other related information.

Background Art

Because the commercial use of the Internet, such as for on-line
shopping or for the dissemination of advertising material using
banner ads, has become so popular, there is great interest in
improving and maximizing the effects produced by this Internet
application. Web site managers perform research to obtain the
reactions of users (web audience ratings) to web page content, and
the results provided by the research are reflected in the subject
matter published on web pages or in the design of web sites, or are
used for one-to-one marketing.

In order to obtain information concerning the web site subjects or
themes users are most attracted to, conventional web audience rating
research methods include the provision of questionnaires that site
visitors are requested to complete, and means for garnering browser
access information, including page display time and the number of
page visits, that is subsequently used to prepare estimated user
reaction profiles. The access information referred to here is the
number of HTTP access requests (the number of hits) received by a
server, and other information concerning the browsing of specific
web contents that is acquired by a client.

According to the web audience rating method in which users are
requested to complete questionnaires, the research is conducted by
asking the users for informative entries. Specifically, a
questionnaire page, for example, is prepared in advance for
inclusion in web contents, so that users can select interesting
topics and keywords. Alternatively, buttons labeled "Interesting" or
"Boring" are distributed across web pages, and viewers are invited
to select and click on them. According to this method, since the
information is obtained as a result of informative input operations
performed by users, the obtained information can be used to very
reliably track user interest trends.

As one type of information that can be obtained by a server for use
in web audience rating research, the count of HTTP access requests
(the number of hits) issued for web page contents is heavily relied
on. When a web page is available and can be read using a web
browser, and when an image is embedded in a web page or framing is
employed, the number of hits received for the specific page is
counted. In this case, a web server does not accept an HTTP access
request when it is moving from one set of web page contents to
another.

According to this method, all the content (resource) accesses
initiated by a user can be recorded. And when this data is combined
with information concerning the resource type (HTML files, images,
etc.) involved, the length of time the user spent viewing the
predetermined web contents can be estimated.

DOCKET NUMBER: JA919990254US1



Since a client can monitor the state of a window that is displayed
by a web browser, a client is able to obtain more detailed
information than is a server. For example, a client can measure the
display time for each page, and for windows can record and examine
all changes in location and all sizes and resolutions used for
focusing, while at the same time recording keywords selected by a
user's manipulation of a data entry device. Additionally, the
browsing history of a user can be recorded, without it being limited
to a specific web site. Based on the information obtained by
employing such a method, user interest trends can, to a degree, be
estimated.

In addition, the search engines that users employ to obtain desired
information are available for use in research. When using a search
engine, a user enters a keyword and clicks on a start button or
presses Enter, and the search engine then scans a number of web
pages for the keyword. Subsequently, if web pages containing the
keyword are found, the search engine displays them in a listing.
For this process, however, because of the huge number of web pages
that are available, it is important that some restriction be applied
that can appropriately reduce the number of pages scanned. As a
technique for accomplishing this, from the pages listed as a result
of one search a user selects a new keyword from the page that best
matches his or her interest, and uses the new keyword to initiate
another search. In this case, by using a keyword extracted from a
document that the user selected as the one that most nearly matched
the purpose of the search, the search conditions are automatically
changed. Thus, the trend of the user's interest will be reflected in
the search results. In this case, the keyword that is employed is
one that is representative of the entire page that is selected.

Problems Solved by the Invention

However, when information concerning user interests, such as the
subject and the theme to which a user's attention is drawn, is
acquired by using conventional web audience rating research or by
using a conventional search engine, the amount and reliability of
the data obtained are not satisfactory. When the method in which
users are requested to complete questionnaires is employed, the work
involved in filling out the questionnaires is imposed on the users,
so a high response rate cannot be obtained. Similarly, taking into
consideration the load that is to be imposed on users, it is
difficult to issue a complicated questionnaire in which an
evaluation is requested for each item, such as each sentence,
appearing on a page. Further, to present the questionnaire, pages
and buttons for the questionnaire must be prepared, so that
obtaining information concerning arbitrary web contents is not an
easy task.

According to the method for estimating the audience rating by using
the information acquired by the server machine or the client
machine, the information obtained by the server consists simply of
the number of hits web contents receive, as described above. From
this, the time a user spent viewing predetermined web contents can
be estimated, but detailed information, such as which web page the
user read and the time the user spent reading it, cannot be obtained
for each web page. These data could be acquired, however, were a
client machine capable of monitoring the state of a window that is
displayed by a web browser. But since means for monitoring a web
browser would have to be mounted on a client machine as an
application program or as a proxy server, and since control of the
monitoring operation would have to be exercised from outside the web
browser, the data structure of a web page cannot be accessed. As a
result, the manipulation of an object in an HTML document by a mouse
cannot be recorded, and thus, detailed information, such as which
portion of a web page a user is particularly interested in, cannot
be acquired.

A method for accessing information acquired and presented by a web
browser can be one for which a Java applet is used for mounting the
web contents. However, since with this method only the contents of
the Java applet could be obtained, it is not appropriate for
application to a common web page.


Further, as is described above, according to the method for changing
the search conditions for the operation of a search engine based on
an evaluation that is made by a user, a keyword used for this
purpose is extracted from a document that constitutes a target web
page. Thus, a portion (a sentence or a word) that the user pays
particular attention to in the document cannot accurately be
reflected, in detail, by the search condition.

Summary of the Invention

To resolve the above technical shortcomings, it is one object of the
present invention to eliminate the need for explicit input by users,
and to permit users to obtain detailed information concerning those
portions of web contents in which they are most interested.

Further, it is another object of the present invention to extract
detailed information concerning operations performed by users,
including operations involving the use of web browsers, so that this
information will be available and can be used when user interest
trends are being plotted. These and other objects of the present
invention are achieved as subsequently described.

Brief Description of the Drawings

These and other aspects, features, and advantages of the present
invention will become apparent upon further consideration of the
following detailed description of the invention when read in
conjunction with the drawing figures, in which:

Fig. 1 is a diagram for explaining the overall arrangement of an
information extraction system according to an example embodiment of
the present invention;

Fig. 2 is a conceptual diagram for explaining the functions of an
operating event detector 10, an operating event analyzer 20 and a
text extractor 30 according to the embodiment;

Fig. 3 is a diagram for explaining a program required when dynamic
HTML is employed to carry out the text extraction process of the
text extractor 30 when text selection is performed;

Fig. 4 is a diagram for explaining the text extraction process when
text selection is performed;

Fig. 5 is a diagram for explaining a program required when dynamic
HTML is employed to carry out the text extraction process of the
text extractor 30 when pointing to a link is performed;

Fig. 6 is a diagram for explaining the text extraction process when
the pointing to the link is performed;

Fig. 7 is a diagram for explaining the process for identifying the
line immediately above the line the mouse pointer overlaps during a
tracing and reading operation;

Fig. 8 is a diagram for explaining the text extraction process when
a tracing and reading operation is performed;

Fig. 9 is a diagram for explaining a mode for providing an
information extraction system according to the present invention;

Fig. 10 is a diagram for explaining another mode for providing an
information extraction system according to the present invention;

Fig. 11 is a diagram for explaining an additional mode for providing
an information extraction system according to the present invention;

Fig. 12 is a diagram for explaining one further mode for providing
an information extraction system according to the present invention;
and

Fig. 13 is a diagram showing a comparison between the example
embodiment of the present invention and the prior art when the text
extracted in this embodiment is employed to generate a keyword
vector for a search engine.

Description of the Invention

To achieve the above objects, according to the present invention, an
information extraction system comprises a server and a client,
connected via a communication network, wherein the server provides a
data file for a client to browse; and wherein the client includes
browsing means for displaying the contents of the data file that is
received from the server via the communication network, operation
detection means for detecting a predetermined specific operation
based on a user's operation when the user reads the contents of the
data file displayed by the browsing means, and means for extracting
information that is displayed at a location whereat the specific
operation that is detected by the operation detection means is
performed on a display screen of the browsing means.

According to the present invention, an information extraction system
comprises: a web server for storing web contents; and a client for
receiving the web contents from the web server, via a communication
network, and for displaying the web contents, the client including
an operating event detection function for detecting, as a
manipulation event, an operation performed by a user on a display
screen of the web contents, wherein a program package, which is
written in a function expansion program language for expanding the
functions available to the client, is embedded in the web contents
stored in the web server, the program package permitting the client
to perform a process for employing the operating event detection
function of a client to detect an operating event, a process for
analyzing a string of operating events that are detected to extract
a predetermined, specific operation, and a process for extracting
from the web contents target information for the specific operation,
and for returning the target information to the web server. This
arrangement is superior because when an information processing
apparatus accesses the web contents in which a web contents creator
has embedded a program package, the information processing apparatus
can obtain information concerning the contents that a user is
interested in. The obtained information can then be employed for
services, such as research performed to ascertain web audience
ratings and a narrowing of the search conditions for a search
engine.

Furthermore, according to the present invention, an information
extraction system comprises: a web server, for storing web contents;
and a client, for receiving the web contents from the web server,
via a communication network, and for displaying the web contents,
wherein the client includes an operating event detection function
for detecting, as a manipulation event, an operation performed by a
user on a display screen of the web contents, wherein the web server
embeds, in the web contents, a program package, which is written in
a function expansion program language, that expands the functions
available to the client and that permits the client to perform a
process for employing the operating event detection function
belonging to the client to detect an operating event, a process for
analyzing a string of operating events that are detected to extract
a predetermined specific operation and a process for extracting
target information for the specific operation from the web contents
and for returning the target information to the web server, and
wherein the web server transmits the program package to the client.
This arrangement is superior because the web server can obtain
information concerning which of the stored web contents the user is
interested in. The obtained information can then be employed for
services, such as research performed to ascertain web audience
ratings and a narrowing of the search conditions for a search
engine.

Further, according to the present invention, an information
extraction system comprises: a web server, for storing web contents;
a proxy server, for receiving web contents from the web server via a
communication network and for performing an additional process; and
a client, for displaying the web contents for which the proxy server
has performed the additional process, wherein the client includes an
operating event detection function for detecting, as a manipulation
event, an operation performed by a user on a display screen of the
web contents, wherein the proxy server embeds, in the web contents
received from the web server, a program package, which is written in
a function expansion program language, that expands the functions
available to the client, and that permits the client to perform a
process for employing the operating event detection function
belonging to the client to detect an operating event, a process for
analyzing a string of operating events that are detected to extract
a predetermined specific operation and a process for extracting
target information for the specific operation from the web contents,
and for returning the target information to the proxy server, and
wherein the proxy server transmits the program package to the
client. This arrangement is superior because the proxy server can
obtain information concerning which of the stored web contents the
user is interested in. The obtained information can then be employed
for services, such as research performed to ascertain web audience
ratings and a narrowing of the search conditions for a search
engine.
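
As an illustration only, the embedding step performed by a server or
proxy server can be sketched as the insertion of a script reference
into received HTML. The function name embed_program_package and the
path /tracker.js are hypothetical and do not appear in this
specification; the sketch assumes the program package is delivered as
a script file and is not the implementation of the embodiment.

```python
import re
from html import escape


def embed_program_package(page: str, script_url: str) -> str:
    """Return `page` with a <script> tag for `script_url` embedded.

    Hypothetical helper: the tag is placed just before the last
    closing </body> tag, or appended when no such tag exists.
    """
    tag = f'<script src="{escape(script_url, quote=True)}"></script>'
    # Find the last closing body tag, ignoring letter case.
    last = None
    for last in re.finditer(r"</body\s*>", page, flags=re.IGNORECASE):
        pass
    if last is None:
        return page + tag
    return page[:last.start()] + tag + page[last.start():]


if __name__ == "__main__":
    received = "<html><body><p>Hello</p></body></html>"
    print(embed_program_package(received, "/tracker.js"))
```

Placing the tag at the end of the body is one possible design choice;
it lets the displayed document load before the embedded program
begins to monitor operating events.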

Instead of transmitting the program package to the client, the proxy
server may include: operating event acquisition means, for
collecting operating events that are detected by the client;
operating event analysis means, for analyzing a string of the
operating events that are received from the client and for
extracting a predetermined specific operation; and information
extraction means, for extracting, from the web contents, target
information for the predetermined specific operation. This
arrangement is preferable because, based on the operating event, a
proxy server can extract information concerning specific operations,
and information concerning data that users are interested in, so
that the load imposed on clients can be reduced. The web contents
from which a proxy server extracts information may be those that the
proxy server receives from a web server and stores, or may be those
that are requested from a client when the information is to be
extracted.

Moreover, according to the present invention, an information
extraction system comprises: a web site, for storing web contents;
an information processing apparatus that includes a web browser, for
receiving the web contents from the web server, via a communication
network, and for displaying the web contents; and a portal site, for
the information processing apparatus, wherein the portal site, upon
being accessed by the information processing apparatus, transmits,
to the information processing apparatus, a program file that serves
as a local proxy for the information processing apparatus, wherein
the web browser of the information processing apparatus includes an
operating event detection function for detecting, as an operating
event, an operation performed by a user on a screen on which the web
contents are displayed, wherein the local proxy, which is operated
by the information processing apparatus, embeds in the web contents
received from the web server a program package, which is written in
a function expansion program language, for expanding the functions
available with the web browser, the program package permitting the
web browser to perform a process for employing the operating event
detection function belonging to a web browser to detect an operating
event, a process for analyzing a string of operating events that are
detected to extract a predetermined, specific operation, and a
process for extracting target information for the specific operation
from the web contents, and wherein the web browser transmits, to the
portal site, information extracted by the web browser. This
arrangement is superior because a portal site can obtain information
concerning which of the web contents received by an information
processing apparatus interested a user. The obtained information can
then be used for services, such as research performed to ascertain
web audience ratings and a narrowing of the search conditions for a
search engine.

According to the present invention, an information processing
apparatus comprises: browsing means, for displaying document data;
operation detection means, for employing an input operation,
performed by a user when the user browses the document data
displayed by the browsing means, to detect an operation defined as a
specific operation that the user unintentionally performed to obtain
interesting information; and character string extraction means, for
extracting a character string that is displayed at a location
whereat the specific operation that is detected by the operation
detection means is performed on a display screen of the browsing
means.

An operation that a user unintentionally performs to obtain
interesting information differs from an active, intentional effort,
such as when a user inputs information to complete a questionnaire.
This operation constitutes an unintentional act that occurs while
the user is reading a document carefully, such as when the user is
reading text while tracing it with a mouse pointer, or such as when
the user is reading text within a selected range. The above
arrangement is preferable because, when this operation is detected
and the target information for the operation is obtained,
information can be obtained concerning those contents in which a
user is interested without requesting the user to actively and
intentionally input information.

The character string extraction means extracts a sentence unit or a
line unit that includes the character string that is displayed at
the location whereat the specific operation is performed. To extract
the sentence or the line as a unit, the location of a return code or
the delimiter for the sentence or the line is detected by extending
the range of the character string that is to be extracted, and
subsequently extracting the text within that range. This arrangement
is preferable because the contents which interest the user can be
extracted as information that conveys a specific meaning.

According to the present invention, an information collection
apparatus that is connected to an information processing apparatus,
which includes a web browser that receives web contents from a web
server and displays the web contents, and which collects information
concerning the information processing apparatus, comprises: storage
means, for storing a program file for embedding, in the web contents
received from the web server, a program package, which is written in
a function expansion program language, that expands the functions of
the web browser and that permits the web browser to perform a
process for employing an operating event detection function
performed by the information processing apparatus to detect an
operating event, a process for analyzing a string of operating
events that are detected to extract a predetermined specific
operation, and a process for extracting target information for the
specific operation from the web contents; transmission means, for
reading the program file from the storage means and for transmitting
the program file to the information processing apparatus; and
information collection means, for collecting the information
extracted by the information processing apparatus.

The program file stored in the storage means of the information
collection apparatus is prepared as a Java applet, and the program
package, which is written in JavaScript, is embedded in the web
contents. This arrangement is preferable because information can be
extracted by using a web browser that supports the Java language,
which is widely employed by personal computers. Furthermore, this
arrangement is superior because, since the program file is prepared
as a Java applet, it need not be distributed in advance to the
information processing apparatus.

Further, according to the present invention, a character string
extraction method comprises the steps of: detecting a predetermined,
specific operation based on an input operation performed by a user
on a display screen on which document data are displayed; and
extracting, as a unit, a sentence or a line that includes a
character string that is displayed at a location whereat the
specific operation that is detected has been performed on the
display screen.
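
The extraction of a sentence or a line as a unit, by extending the
range outward from the character at the operated location until a
delimiter or a return code is found, might be sketched as follows.
This is a minimal sketch under stated assumptions: the function name
extract_unit, the delimiter set ".!?" and the use of a character
offset to represent the operated location are all hypothetical and
are not taken from this specification.

```python
SENTENCE_DELIMITERS = ".!?\n"  # assumed delimiter set, for illustration


def extract_unit(text: str, offset: int,
                 delimiters: str = SENTENCE_DELIMITERS) -> str:
    """Return the sentence (or line) of `text` that contains `offset`.

    The range is extended backward and forward from `offset` until a
    delimiter or return code is met, then the text inside is returned.
    Pass delimiters="\n" to extract a line unit instead of a sentence.
    """
    if not 0 <= offset < len(text):
        raise IndexError("offset is outside the text")
    start = offset
    while start > 0 and text[start - 1] not in delimiters:
        start -= 1  # extend the range toward the beginning
    end = offset
    while end < len(text) and text[end] not in delimiters:
        end += 1  # extend the range toward the end
    if end < len(text) and text[end] != "\n":
        end += 1  # keep the closing punctuation, but not a return code
    return text[start:end].strip()


if __name__ == "__main__":
    doc = "First sentence. Second one is here. Third."
    print(extract_unit(doc, doc.index("one")))
    print(extract_unit("line one\nline two\nline three", 10, "\n"))
```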

In addition, according to the present invention, a character string
extraction method comprises the steps of: detecting, based on an
input operation performed by a user on a display screen on which
document data are displayed, a tracing and reading movement by which
the pointer of a pointing device is moved along lines in a document
that is displayed; and extracting, as a unit, a sentence or a line
that includes a character string that is displayed at a location
whereat the tracing and reading operation has been performed on the
display screen. This arrangement is especially superior because,
when tracing and reading are performed, the text at the location
whereat the tracing and reading are performed can be extracted,
without requiring any active, intentional input operation by a user.
Further, this arrangement is preferable because a sentence or a line
unit is employed to extract the character string, so that the
contents in which a user is interested can be extracted as
information that establishes a specific meaning.

At the step of extracting a character string, a sentence or a line
is also extracted that includes a character string located in the
document immediately above the character string indicated by the
pointer as the pointer is moved across the display screen. This is
because, when a user reads a document while tracing it, he or she
may read a line immediately above the line whereat the mouse pointer
is located. This arrangement is superior because information that
the user seems to be interested in reading can be thoroughly
extracted.
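
The handling of the line immediately above the pointer can be
illustrated with a short sketch. It assumes that each displayed line
is known by its vertical extent on the screen; the type Line and the
function lines_of_interest are hypothetical names introduced only for
this example, and the sketch is not the process of the embodiment.

```python
from typing import List, NamedTuple, Sequence


class Line(NamedTuple):
    top: int      # y coordinate of the upper edge of the line
    bottom: int   # y coordinate of the lower edge (exclusive)
    text: str


def lines_of_interest(lines: Sequence[Line], pointer_y: int) -> List[str]:
    """Return the line under the pointer preceded by the line above it.

    `lines` is assumed to be ordered from top to bottom. An empty list
    is returned when the pointer does not overlap any line.
    """
    for index, line in enumerate(lines):
        if line.top <= pointer_y < line.bottom:
            above = [lines[index - 1].text] if index > 0 else []
            return above + [line.text]
    return []


if __name__ == "__main__":
    page = [Line(0, 20, "first"), Line(20, 40, "second"),
            Line(40, 60, "third")]
    print(lines_of_interest(page, 45))
```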

Furthermore, according to the present invention, a character string
extraction method comprises the steps of: employing an input
operation performed by a user on a display screen on which document
data are displayed to detect a line tracing and reading operation
during which lines of a displayed document are pointed at in order,
while the pointer of a pointing device is moved in a direction
perpendicular to the lines; and extracting as a unit a sentence or a
line that includes a character string that is displayed at a
location whereat the line tracing and reading operation has been
performed on the display screen. This arrangement is especially
superior because when a user reads a long sentence while moving a
mouse in the direction perpendicular to the lines of text, the text
whereat the tracing and reading operation is performed can be
extracted, without an active, intentional input operation being
required of the user.

For horizontal text, a tracing and reading operation is detected in
accordance with the movement of a pointer in the transverse
direction that matches the direction of lines, and the line tracing
and reading operation is detected in accordance with the vertical
movement of the pointer perpendicular to the lines. On the other
hand, for vertical text, the tracing and reading operation is
detected by the vertical movement of the pointer that matches the
direction of lines, and the line tracing and reading operation is
detected from the transverse movement of the pointer perpendicular
to the lines.
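
The distinction drawn above between the two operations, for
horizontal and for vertical text, can be sketched as a comparison of
the pointer travel along the lines with the travel across them. The
function name classify_pointer_motion and the thresholds min_travel
and ratio are assumptions made for this illustration; this
specification does not give numeric criteria.

```python
from typing import Sequence, Tuple


def classify_pointer_motion(points: Sequence[Tuple[int, int]],
                            horizontal_text: bool = True,
                            min_travel: int = 30,
                            ratio: float = 3.0) -> str:
    """Classify sampled pointer positions (x, y).

    Returns "tracing" when movement follows the direction of the
    lines, "line_tracing" when it is perpendicular to the lines, and
    "none" otherwise. The thresholds are illustrative assumptions.
    """
    if len(points) < 2:
        return "none"
    pairs = list(zip(points, points[1:]))
    dx = sum(abs(b[0] - a[0]) for a, b in pairs)  # transverse travel
    dy = sum(abs(b[1] - a[1]) for a, b in pairs)  # vertical travel
    # For horizontal text the lines run transversely; for vertical
    # text they run vertically, so the two roles are exchanged.
    along, across = (dx, dy) if horizontal_text else (dy, dx)
    if along >= min_travel and along >= ratio * across:
        return "tracing"
    if across >= min_travel and across >= ratio * along:
        return "line_tracing"
    return "none"


if __name__ == "__main__":
    sweep = [(0, 100), (20, 101), (60, 100), (120, 102)]
    print(classify_pointer_motion(sweep))
    print(classify_pointer_motion(sweep, horizontal_text=False))
```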



According to the present invention, provided is a storage medium on
which the input means of a computer stores a computer-readable
program that permits the computer to perform: a process for
displaying the contents of document data; a process for detecting a
predetermined specific operation based on a user's operation on a
display screen where the document data are displayed; and a process
for extracting a character string that is displayed at a location
whereat the specific operation that is detected is performed on the
display screen. This arrangement is superior because when an
information processing apparatus loads a program and displays
document data, information can be obtained concerning the contents
of the document in which a user shows an interest. When the obtained
information is transmitted to a server, it can be used for services,
such as research performed to ascertain web audience ratings and a
narrowing of the search conditions for a search engine.


Preferred Embodiment 

The preferred embodiment of the present invention will be described
in detail while referring to the accompanying drawings. First, an
overview of the present invention will be given.

According to the present invention, it is assumed that a
relationship is established between the unintentional movement of a
mouse and what a user is interested in when the user is browsing a
document displayed on a computer screen, and the characteristic
movement of the mouse is detected in order to extract information
that it is assumed is interesting to the user. Since the information
concerning the interest of the user is extracted based on the
movement of the mouse, a target in which the user shows an interest
can be specified by using a small unit, such as a word or a sentence
in a document, or an inserted table.

In this example embodiment, the following five mouse movements are
defined as operations that the user unintentionally performs for a
target that he or she is interested in.

1. Moving the mouse pointer while the button of the mouse is
   depressed (dragging).

2. Pointing with the mouse pointer at a link that overlaps a second
   link.

3. Clicking on the link using the mouse.

4. Moving the mouse pointer in the transverse direction when the
   text is being read as the mouse pointer is moved along the lines
   of the text (hereinafter referred to as tracing and reading).

5. Using the mouse pointer to designate the line in text that is
   currently being read, and gradually moving the mouse pointer
   vertically as each line is read (hereinafter referred to as
   vertical tracing and reading).

In this embodiment, the movements are defined for a mouse that is used as a pointing device. However, when another pointing device, such as a track ball or a pen tablet, is employed, it is assumed that substantially the same movements are performed for a target that the user is interested in. Therefore, in the following explanation, the pointing device type is not particularly designated, and the mouse is employed as an example.

The operations that a user unintentionally performs for an interesting target are not limited to the five operations that have been explained. Any other operation that a user is estimated to perform for such a target can be defined and employed for information extraction.

Fig. 1 is a diagram for explaining the overall arrangement of an information extraction system according to this embodiment. An operating event detector 10 monitors the movement of a mouse on a document that is displayed on a computer screen, and detects an operating event. An operating event analyzer 20 analyzes a string of operating events (hereinafter referred to as an operating event string) that are detected by the operating event detector 10, and extracts a specific operation that the user appears to have performed for an interesting target. A text extractor 30 extracts, from the document that is displayed on the computer screen, the text that is the target of the specific operation extracted by the operating event analyzer 20. These components are implemented as program modules that permit the computer to perform the above processes.

In this embodiment, the display screen of a web browser used to display web contents that generally are employed on the Internet is defined as the area wherein the movement of a mouse is monitored. That is, the operating event detector 10 detects an operating event in accordance with the movement of a mouse across the web contents (a home page) that are displayed by a web browser, the operating event analyzer 20 extracts a specific operation performed during the detected operating event, and the text extractor 30 extracts the target text as information that a user is interested in. In this case, the operating event detector 10, the operating event analyzer 20 and the text extractor 30 can be implemented by using dynamic HTML functions.

The operating event detector 10 can be implemented by embedding it in an HTML file using a script language, such as JavaScript. In JavaScript, the movement of a mouse, clicking or dragging, the selection of a character string, the depression/release of a key, and the scrolling of a screen can be extracted as events. For example, when the event handler "OnMouseMove" is defined for the movement of a mouse and is written in an HTML file, the movement of the mouse can be detected as an operating event. Furthermore, when the movement of a mouse is to be monitored on a display screen for a document other than web contents, i.e., one that is prepared by a predetermined application program, the API of an operating system can be employed to extract an operating event in accordance with a specific mouse movement.
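As an illustration only, the detection of operating events by a script could be sketched as follows. The function name, the shape of the event record and the choice of event types are assumptions made for this sketch and do not appear in the specification; only the standard addEventListener and removeEventListener calls and the clientX, clientY and timeStamp event properties are relied upon.

```javascript
// Minimal sketch of an operating event detector (all names are hypothetical).
// `target` is anything exposing addEventListener/removeEventListener,
// for example `document` in a web browser.
function createOperatingEventDetector(target, onOperatingEvent) {
  const types = ["mousemove", "mouseover", "mouseout", "mouseup", "click"];
  const handlers = new Map();
  for (const type of types) {
    const handler = (e) => {
      // Forward a small record describing the operating event.
      onOperatingEvent({
        type,
        x: e.clientX,
        y: e.clientY,
        time: typeof e.timeStamp === "number" ? e.timeStamp : Date.now(),
        target: e.target,
      });
    };
    handlers.set(type, handler);
    target.addEventListener(type, handler);
  }
  // The returned function stops the monitoring.
  return function stop() {
    for (const [type, handler] of handlers) {
      target.removeEventListener(type, handler);
    }
  };
}
```

In a browser, such a detector might be started with createOperatingEventDetector(document, handleEvent), where handleEvent is whatever routine passes each record on to the analyzer.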

The operating event analyzer 20 analyzes an operating event string that is detected by the operating event detector 10, and determines whether the operating event string is pertinent to a specific operation that has been defined in advance. When the operating event string is pertinent to the specific operation, the operating event analyzer 20 notifies the text extractor 30 that the operation has been performed. Further, the operating event analyzer 20 transmits information, such as the position whereat the operation was performed, to the text extractor 30 in order for it to be employed for the extraction of text. The specific operation that has been defined in advance is an operation that it is estimated a user will unintentionally perform for an interesting target. In this embodiment, the above described operations, i.e.,

1. selecting of text,
2. pointing to a link,
3. clicking on a link,
4. tracing and reading, and
5. vertical tracing and reading,

are defined as specific operations. A detailed explanation will be given later for the processing used to extract these specific operations from an operating event string that is detected by the operating event detector 10.

When the text extractor 30 receives, from the operating event analyzer 20, a notification that a specific operation has been extracted, the text extractor 30 additionally receives, from the operating event analyzer 20, information such as the coordinate value required for extraction of the text. Thereafter, in accordance with the received information, the text extractor 30 obtains the target text for the specific operation from the pertinent position of the web contents that are displayed by the web browser. A detailed explanation will be given later for the text extraction processing performed for each operation that is extracted by the operating event analyzer 20.

DOCKET NUMBERS JA919990254US1

Then, the obtained text is transmitted to another system that employs the pertinent text. For example, a system that conducts research to ascertain web audience ratings, or a search engine, can receive the text obtained by the text extractor 30 and can employ the text as information related to the target that the user is interested in.

Fig. 2 is a conceptual diagram for explaining the processing performed by the operating event detector 10, the operating event analyzer 20 and the text extractor 30. In Fig. 2, the operating event detector 10, the operating event analyzer 20 and the text extractor 30 are written in JavaScript and are embedded in web contents 200.

While referring to Fig. 2, first, assume that a specific operation is performed by using a mouse for predetermined text 201 in the web contents 200 that are displayed by a web browser (211). Then, the operating event detector 10 detects an operating event based on the movement of the mouse, and transmits the operating event to the operating event analyzer 20 (212). Next, the operating event analyzer 20 analyzes the operating event string and extracts a specific operation. Following this, a notification that the specific operation has been extracted and information concerning the contents of the operation are transmitted to the text extractor 30 (213). Thereafter, the text extractor 30 performs a process in accordance with the specific operation, and extracts the text 201 from the web contents 200 (214).

Since it is assumed that the thus obtained text 201 is information that the user was interested in when he or she browsed the web contents 200, this information can be used for various services, such as research performed to ascertain web audience ratings and the narrowing of search conditions for a search engine. The extracted text 201 must then be transmitted to an operator who desires to use the text 201 as information concerning the user, and for this various methods may be employed: the text 201 may be embedded in a script form in the web contents 200 and transmitted by using a function of the web browser, or a predetermined program may be provided for an information processing apparatus and its function may be employed to transmit the text.

The text acquisition processing for the embodiment will now be described in detail for each of the specific operations.

First, an explanation will be given for the case wherein the selection of text is performed as the specific operation.

From an operating event string that is transmitted by the operating event detector 10, the operating event analyzer 20 detects a "select" event that is generated when a user selects text. Based on the "select" event, the operating event analyzer 20 obtains a "selection" object that corresponds to the text selection operation. The end of the text selection operation can be identified by a "mouseup" event that is generated when a mouse button is released by the user. In dynamic HTML, when text selection is performed, the area that is selected can be obtained as a "selection" object. Therefore, in a web browser that supports dynamic HTML, the "selection" object can be obtained immediately at the time the text is selected by a user.

The text extractor 30 extracts the selected text by using the "selection" object that is obtained by the operating event analyzer 20. Thus, as is shown in Fig. 4, the character string "cat is very" is extracted from the sentence "This cat is very smart." The extracted character string "cat is very" is transmitted to a predetermined system, and is used as information that the user is interested in.

Fig. 3 is a diagram for explaining the program required when the text extraction process of the text extractor 30 is carried out using dynamic HTML. Fig. 4 is a diagram for explaining the text extraction process when the text string "cat is very" in the sentence "This cat is very smart" is selected.

In this example, a "getSelectedText" function is defined as the function used for the extraction of text. The argument for the "getSelectedText" function is the selection object "si," which is generated by a user's selection of text (a "selection" object 401 in Fig. 4). On the third line of the program list in Fig. 3, the TextRange object "tr" ("TextRange" object 402 in Fig. 4) is generated by the "createRange" method based on the obtained selection object "si". The "TextRange" object is an object for text operations in dynamic HTML. On the fourth line of the program list, the selected text "cat is very" (text 403 in Fig. 4) is extracted by using the "text" property of the TextRange object.
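The program list of Fig. 3 is not reproduced in this text. Based solely on the description above, a function of this kind might look like the following sketch; it is a reconstruction under that assumption, not the original listing, and its parameter name and line layout are chosen here for illustration. It relies on the legacy dynamic HTML "selection" object, whose createRange method returns a TextRange having a "text" property.

```javascript
// Sketch reconstructed from the description of Fig. 3; not the original listing.
// `selection` is expected to behave like the legacy dynamic HTML "selection"
// object, i.e. to offer a createRange() method that returns a TextRange.
function getSelectedText(selection) {
  // Generate a TextRange that covers the area selected by the user.
  const tr = selection.createRange();
  // The "text" property of the TextRange holds the selected character string.
  return tr.text;
}
```

In the browsers that supported this object model, the call would have been getSelectedText(document.selection). Current browsers expose the selected string through window.getSelection().toString() instead.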

An explanation will now be given for an example wherein pointing to a link is performed as a specific operation.

Of the events in an operating event string received from the operating event detector 10, the operating event analyzer 20 employs an event that occurs when a mouse pointer is placed on a link, and an event that occurs when the mouse pointer is removed from the link, to detect a link pointing operation. In this embodiment, when these events occur, text for which a link tag is provided and text that includes a portion into which a link is extended are extracted as a unit consisting of a sentence or a line. Further, in order to exclude a case wherein a mouse pointer simply passes through a link and a case wherein a mouse pointer accidentally remains on the link for an extended period of time, the pointing duration is measured and is used as a determination condition.

Specifically, first, when an event occurs indicating that a mouse pointer has been placed on a link (a "mouseover" event), a time t1 for the occurrence is stored. Then, when an event occurs indicating that the mouse pointer has been moved (a "mousemove" event), the position (coordinate value) of the mouse pointer on the link is obtained. Following this, when an event occurs indicating that the mouse pointer has been removed from the link (a "mouseout" event), a time t2 for the occurrence is obtained. If Ti < (t2 - t1) < Th is established for the threshold values Ti and Th, it is assumed that a link pointing operation using the mouse has been performed. The text extractor 30 is notified to this effect, and the position information for the mouse pointer that is obtained by the "mousemove" event is transmitted to the text extractor 30.

The threshold values Ti and Th are provided in order to exclude a case wherein a mouse pointer simply passes a link and a case wherein the mouse pointer accidentally remains on the link for an extended period of time. That is, when Ti ≥ (t2 - t1) is established, it is assumed that the mouse pointer merely passed the link, and no notification is transmitted to the text extractor 30. And when (t2 - t1) ≥ Th is established, it is assumed that the mouse pointer accidentally remained on the link, and again, no notification is transmitted to the text extractor 30.
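The determination described above can be sketched as follows. This is only an illustrative sketch: the function names, the event record fields (type, time, x, y) and the idea of also recording the position at the "mouseover" event are assumptions introduced here, and any concrete threshold values would have to be chosen by the implementer.

```javascript
// Hypothetical helper: true only when Ti < (t2 - t1) < Th.
function isLinkPointing(t1, t2, ti, th) {
  const dwell = t2 - t1;
  return ti < dwell && dwell < th;
}

// Tracks "mouseover", "mousemove" and "mouseout" records for one link and
// passes the last known pointer position to `notifyExtractor` when the
// dwell time lies strictly between the two thresholds.
function createLinkPointingAnalyzer(ti, th, notifyExtractor) {
  let t1 = null;      // time of the "mouseover" event
  let lastPos = null; // most recent pointer position on the link
  return function handle(ev) {
    if (ev.type === "mouseover") {
      t1 = ev.time;
      // Assumption: keep this position in case no "mousemove" follows.
      lastPos = { x: ev.x, y: ev.y };
    } else if (ev.type === "mousemove" && t1 !== null) {
      lastPos = { x: ev.x, y: ev.y };
    } else if (ev.type === "mouseout" && t1 !== null) {
      if (isLinkPointing(t1, ev.time, ti, th)) notifyExtractor(lastPos);
      t1 = null;
      lastPos = null;
    }
  };
}
```

A pointer that leaves the link too quickly, or only after a very long stay, produces no notification, which corresponds to the two excluded cases.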

When the text extractor 30 receives a notification that a link pointing operation has been performed, and when the place whereat the pointing operation took place is a link tag, the text extractor 30 extracts the text for which the link tag is provided as a sentence or a line unit. If the site whereat the pointing operation has been performed is a link that is extended to a predetermined location, the text extractor 30 extracts as a sentence or a line unit the text that includes that link.

A method for extracting text as a sentence or a line unit will now be described. To delimit text by separating it into sentences or lines, first, the range of the text to be extracted is gradually expanded from the position (coordinate value) whereat the target link tag for the pointing operation is provided or whereat the pointing operation has been performed. When a return code or a symbol, such as a period or a comma, that represents a delimiter for a line or a sentence appears, the expansion of the range of the text is halted, and the obtained text string is extracted.
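The expansion of the range up to the nearest delimiters can be illustrated on a plain character string. The following is a sketch under simplifying assumptions: the text is given as a string, the position is a character index rather than a screen coordinate, and the function name and the delimiter set (return code, period, comma) are chosen here for illustration.

```javascript
// Hypothetical sketch: expand a range outward from `index` within `text`
// until a delimiter is met on each side, then return the enclosed string.
const DELIMITERS = new Set(["\n", ".", ","]);

function extractUnitAt(text, index) {
  // Nothing to extract outside the text or when pointing at a delimiter.
  if (index < 0 || index >= text.length || DELIMITERS.has(text[index])) {
    return "";
  }
  let start = index;
  let end = index; // inclusive
  // Expand backward until the preceding character is a delimiter.
  while (start > 0 && !DELIMITERS.has(text[start - 1])) start--;
  // Expand forward until the following character is a delimiter.
  while (end < text.length - 1 && !DELIMITERS.has(text[end + 1])) end++;
  // Surrounding white space is trimmed; the delimiters are not included.
  return text.slice(start, end + 1).trim();
}
```

For the text "I have a pet. This cat is very smart. It sleeps." and the index of the word "cat", the sketch yields "This cat is very smart".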

Fig. 5 is a diagram for explaining the program required when the text extraction process of the text extractor 30 is implemented by using dynamic HTML. Fig. 6 is a diagram for explaining the text extraction process that is performed when the pointing operation is performed for a link (the underlined "cat") in the sentence "This cat is very smart," contained in the web contents document that is displayed by the web browser. In this example, the "getLinkTagText" function and the "getLinkText" function are defined as the functions used for extracting text.

The "getLinkTagText" function is a function for extracting text for which a link tag is provided, and the argument is an anchor object, "anchor." On the third line of the program list in Fig. 5, all the text for which the pertinent link tag is provided is extracted. The "getLinkText" function is a function for extracting, as a sentence or a line unit, text that includes a portion into which a link is extended, and the arguments are the coordinates where the mouse pointer is located. The text extraction processing performed by the "getLinkText" function will now be described while referring to Fig. 6.

On the eighth line of the program list in Fig. 5, the "createTextRange" method is employed for the "body" object, and a "TextRange" object that covers the entire web page is generated ("TextRange" object 601 in Fig. 6). Then, on the ninth line of the program list, the "moveToPoint" method is employed to designate, as the "TextRange" object, the character that is pointed at by the mouse pointer ("TextRange" object 602 in Fig. 6). Next, on the tenth line of the program list, the function for changing the selected area of the text (the "changeTextRange" function in Fig. 6 is designated for the performance of this process) is employed to expand the selected range for the "TextRange" object to include a sentence unit or a line unit ("TextRange" object 603 in Fig. 6). Finally, on the eleventh line of the program list, the "text" property of the TextRange object is employed to extract "This cat is very smart" (text 604 in Fig. 6).

An explanation will now be given for a case wherein clicking on a link is the specific operation that is performed. Of the events in the operating event string received from the operating event detector 10, the operating event analyzer 20 employs an event that occurs when a link is clicked on to detect the link clicking operation. As with the link pointing operation, in this embodiment, when the event occurs, the text for which a link tag is provided and the text that includes a portion into which a link is extended are extracted as a sentence unit or a line unit. Specifically, when an event occurs indicating that a mouse pointer has been placed on a link (a "mouseover" event), the occurrence time t1 is stored. Then, when a mouse moving event (a "mousemove" event) occurs, the position (a coordinate value) of the mouse pointer on the link is obtained. Following this, when a click event (a "click" event) occurs, the text extractor 30 is notified of this event occurrence, and position information for the mouse pointer, which was obtained at the time of the "mousemove" event, is transmitted to the text extractor 30.

When the text extractor 30 receives notification of the link clicking operation, and when the place whereat the link click operation occurred is a link tag, the text extractor 30 extracts as a sentence or a line unit the text associated with the link tag. And if the place whereat the link clicking operation occurs is a link that is extended into a predetermined portion of a sentence, the text including the link is extracted as a sentence or a line unit. Since the text extraction process is performed by the text extractor 30 in the same manner as for the pointing operation, no further explanation for it will be given.

An explanation will now be given for the tracing and reading operation. The tracing and reading operation is extracted by using the position (the coordinates) of a mouse pointer that is obtained by using a mouse movement event and an event occurrence time. The movement of a mouse during the tracing and reading operation is linear and horizontal, and various methods can be used for detecting this movement. However, for this embodiment the following method is employed.

First, the sequential horizontal movement of the mouse is detected. When the distance that the mouse travels sequentially and horizontally is equal to or greater than a predetermined threshold value, this movement is detected as a tracing and reading operation. The threshold serves to exclude accidental linear, horizontal movements of the mouse: since it is expected that a mouse would not travel far during such an accidental movement, an appropriate threshold value is set to exclude it. Each time a mouse moving event occurs, the sequential horizontal movement of the mouse can be detected by determining the following two conditions.

First, the inclination of the movement of the mouse, which is obtained from several (two to four) of the latest coordinates for the mouse pointer, is employed to determine whether the mouse is being moved horizontally across the display screen. Second, the difference in the occurrence times between a current event and the immediately preceding event is employed to determine whether the movement of the mouse has been discontinued.

When the above conditions are established, i.e., when it is ascertained that the mouse is moving horizontally and that its movement has not been discontinued, it is assumed that the mouse is traveling sequentially and horizontally. And when either of the two conditions is not established, it is assumed that the sequential horizontal movement has been terminated.

Based on the above premise, an explanation will now be given for the process performed by the operating event analyzer 20 to detect the tracing and reading operation. In the following explanation, a parameter Ar is a threshold value related to the inclination, used to determine whether the direction in which a mouse travels is to be regarded as the horizontal direction. A parameter Tr is a threshold value related to a stop time, used to determine whether the sequential movement of the mouse is continuing. And a parameter L is a threshold value related to the distance of travel, used to determine whether the sequential horizontal movement that has been detected is a tracing and reading operation. The coordinates that are used are represented by orthogonal x-y coordinates, the x direction being defined as the horizontal direction across the display screen (the direction parallel to lines), and the y direction being defined as the vertical direction on the display screen (the direction perpendicular to lines).

Each time the "mousemove" event occurs, the operating event analyzer 20 obtains the difference (x_i - x_{i-n}, y_i - y_{i-n}) between the coordinates (x_i, y_i) of the mouse pointer and the coordinates (x_{i-n}, y_{i-n}) of the mouse pointer when the "mousemove" event occurred n times before. When the difference in the x direction (horizontal) is a positive value, the inclination a is calculated using the following equation:

a = (y_i - y_{i-n}) / (x_i - x_{i-n})

The time interval td between the time t_i of the last event occurrence and the time t_{i-1} of the preceding event occurrence is calculated using the following equation:

td = t_i - t_{i-1}

One of the following four processing types is performed in accordance with the obtained values for a and td.

(1) A case wherein the flag rflag that represents the sequential horizontal movement is OFF and a < Ar and td < Tr have been established (the inclination and the time interval from the preceding event fall within the range of the threshold values). It is assumed that the horizontal and sequential movement of the mouse has begun, the flag rflag is set to ON, and the coordinates (x_i, y_i) of the mouse pointer are stored.

(2) A case wherein the flag rflag is OFF and a ≥ Ar or td ≥ Tr has been established (at the least, either the inclination or the time interval from the preceding event exceeds the range of the threshold value). It is assumed that the mouse is not traveling horizontally and sequentially.

(3) A case wherein the flag rflag is ON and a < Ar and td < Tr have been established. It is assumed that the mouse is moving horizontally and sequentially, and the coordinates (x_i, y_i) of the mouse pointer are stored.

(4) A case wherein the flag rflag is ON and a ≥ Ar or td ≥ Tr has been established. It is assumed that the horizontal and sequential movement of the mouse has ended, and the flag rflag is set to OFF. The stored coordinates of the mouse pointer that were obtained while the mouse was moving horizontally and sequentially are employed to calculate the x coordinates at the start point and the end point of the movement, the average of the y coordinates obtained during the movement, and the distance l of the movement. If l < L, the distance of the extracted movement is less than the threshold value L, and this movement is not determined to be a tracing and reading operation. But if l ≥ L, the movement is determined to be a tracing and reading operation.
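The four cases above amount to a small state machine, which can be sketched as follows. The parameter names Ar, Tr, L and n and the flag name rflag follow the text; everything else is an assumption made for this sketch: the function names, the use of the absolute value of the inclination, the treatment of a non-positive x difference as non-horizontal movement, the measurement of the distance l as the difference between the end and start x coordinates, and the decision to wait until n earlier events are available.

```javascript
// Hypothetical sketch of the four-case analysis for tracing and reading.
function createTracingAnalyzer({ Ar, Tr, L, n }, onTracing) {
  const history = []; // the latest n + 1 records { x, y, time }, newest last
  let stored = [];    // coordinates stored while rflag is ON
  let rflag = false;  // ON while sequential horizontal movement continues

  function finish() {
    const xStart = stored[0].x;
    const xEnd = stored[stored.length - 1].x;
    const yAvg = stored.reduce((sum, p) => sum + p.y, 0) / stored.length;
    const l = xEnd - xStart; // assumption: distance measured along x
    if (l >= L) onTracing({ xStart, xEnd, yAvg, distance: l });
    stored = [];
    rflag = false;
  }

  return function onMouseMove(x, y, time) {
    history.push({ x, y, time });
    if (history.length > n + 1) history.shift();
    if (history.length < n + 1) return; // assumption: wait for n earlier events

    const cur = history[history.length - 1];
    const old = history[0];                   // the event n times before
    const prev = history[history.length - 2]; // the immediately preceding event
    const dx = cur.x - old.x;
    // The inclination is defined only for a positive x difference; any other
    // movement is treated here as non-horizontal (an assumption).
    const a = dx > 0 ? Math.abs((cur.y - old.y) / dx) : Infinity;
    const td = cur.time - prev.time;

    if (a < Ar && td < Tr) {
      rflag = true;           // cases (1) and (3): store the coordinates
      stored.push({ x, y });
    } else if (rflag) {
      finish();               // case (4): the movement has ended
    }                         // case (2): nothing to do
  };
}
```

A caller would feed each "mousemove" record to the returned function, and the callback fires only when a horizontal run of sufficient length has just ended.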

When a tracing and reading operation is detected in the above described manner, the operating event analyzer 20 notifies the text extractor 30 that the tracing and reading operation has been performed, and also transmits to the text extractor 30 the coordinates (position information) of the mouse pointer at the tracing and reading start point and end point that were obtained for the "mousemove" event.

Upon receipt of the notification that the tracing and reading operation has been performed, the text extractor 30 extracts text at the place whereat the tracing and reading operation was performed. In this case, the text on the line that the mouse pointer overlapped during the tracing and reading operation, and the text on the line immediately above, are extracted as sentence or line units. This is because during a tracing and reading operation a user tends to read either the line that the mouse pointer overlaps or the line immediately above. Therefore, since the text on the line that the mouse pointer overlaps and on the line immediately above is extracted, the information that the user seems to be interested in is seldom missed. The text may also be extracted from either the line the mouse pointer overlaps or the line immediately above, instead of being extracted from the two lines.

To identify the line immediately above the line overlapped by the mouse pointer, the lines are sequentially examined upward in the y coordinate direction by employing the position of the mouse pointer as a reference, and when the character string that is detected changes, it is assumed that the examination has shifted to the line immediately above. Specifically, at first, three characters, i.e., the character whereat the mouse pointer is located, a character m characters before and a character n characters after, m and n being numerals equal to or greater than two, are stored. Then, the coordinates are moved several dots at a time from the position of the mouse pointer in the y coordinate direction, and the character at each sequentially obtained coordinate, the character positioned m characters before and the character positioned n characters after are obtained. These obtained characters are compared with the stored characters, i.e., the character at the position whereat the mouse pointer is located and the characters before and after it. When all three characters match, it is assumed that the line is still the one overlapped by the mouse pointer; otherwise, it is assumed that the current line is the one immediately above.

An explanation will now be given for the reason that a total of three characters, the character whereat the mouse pointer is located and the two characters that precede and succeed it by several characters, are employed in order to identify the line the mouse pointer overlapped and the line immediately above. When only one character is employed to identify these two lines, the character pointed at by the mouse pointer may by accident match the character positioned above. Thus, to increase the reliability, a plurality of characters are employed to identify the lines. Characters that are separated by a distance of several characters from the character at which the mouse pointer is pointed are employed because, when the same word is positioned above a word that includes the character that the mouse pointer overlaps, several characters on the upper and lower lines, including the character at the position of the mouse pointer, may be identical, and the possibility that this may occur must be eliminated. Further, characters that are located both before and after the character at the position of the mouse pointer are employed for the following reason. When the character at the position of the mouse pointer is the first or the last on a page, and when characters are extracted only forward or only backward, the character positioned at a distance of several characters from the character pointed at by the mouse pointer may not be on the pertinent page; however, so long as the characters used for comparison are extracted both before and after the character at the position of the mouse pointer, the lines can be identified.

The processing for identifying the line immediately above the line the mouse pointer overlaps will now be described while referring to Fig. 7. In Fig. 7, the three characters ("j," "r" and "e") on the line the mouse pointer overlaps are stored. Then, the target coordinates (the selection range for the "TextRange" object that will be described later) are moved upward from the coordinates at the position of the mouse pointer in the y coordinate direction, and the three characters ("i," "r" and "o") on the line at those coordinates are obtained. The characters at the position of the mouse pointer, both "r," match, but the characters in the two pairs "j" and "i," and "e" and "o," differ, so that the line can be assumed to be the one immediately above.
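The comparison of three characters while moving upward can be sketched as follows. The sketch is illustrative only: the function name, the probe callback, the step size and the step limit are assumptions introduced here, and the way the three characters are actually read from the page at a given height is left outside the sketch.

```javascript
// Hypothetical sketch. `probe(y)` must return the three comparison characters
// [before, center, after] found at the vertical position y.
function findLineAbove(probe, y, step, maxSteps) {
  const stored = probe(y); // characters on the line the pointer overlaps
  for (let i = 1; i <= maxSteps; i++) {
    const yy = y - i * step; // screen y coordinates decrease upward
    const current = probe(yy);
    const same = current.every((ch, k) => ch === stored[k]);
    // At least one character differs: the line immediately above is reached.
    if (!same) return yy;
  }
  return null; // no different line was found within maxSteps
}
```

With the characters of Fig. 7, the stored triple "j", "r", "e" differs from the triple "i", "r", "o" in two positions, so the first height at which the latter is read is reported as belonging to the line immediately above.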

Based on the above premise, an explanation will now be given for the processing performed by the text extractor 30 to extract text from the target line of the tracing and reading operation and the line immediately above. Fig. 8 is a diagram for explaining the program that is required when the text extraction process of the text extractor 30 is implemented by using dynamic HTML.

In this example, the "getTracedText" function is defined to extract the text when the tracing and reading operation is detected. The "getTracedText" function is a function whereby, after the operating event analyzer 20 has detected the tracing and reading operation, the coordinates of the mouse pointer are employed to extract the text on the line the mouse pointer overlaps or on the line immediately above. The arguments x and y are the coordinates (x, y) whereat the mouse pointer is located. In addition, "up" denotes the line to be extracted: when up = false, the line that the mouse pointer overlaps is extracted, while when up = true, the line immediately above the line the mouse pointer overlaps is extracted. On the third line of the program listing in Fig. 8, the "TextRange" object is generated, and on the fourth line, the selection range for the "TextRange" object is shifted to the character that is positioned at the coordinates (x, y) whereat the mouse pointer is located.

The process described on the fifth to twenty-fifth lines of the program listing is used to identify the line immediately above the line the mouse pointer overlaps. First, on the seventh to eleventh lines, the three characters (centerchar1, rightchar1 and leftchar1) on the line the mouse pointer overlaps are obtained. These characters are the character (centerchar1) at the position of the mouse pointer, the character (rightchar1) positioned CMOVE characters after it, and the character (leftchar1) positioned CMOVE characters before it.

On the twelfth to twenty-fourth lines, the coordinates are moved up PMOVE points in the y coordinate direction from the position of the mouse pointer. Then, the three characters (centerchar2, rightchar2 and leftchar2), i.e., the character at the current position and the two characters positioned CMOVE characters before and after it, are obtained. These characters are compared with the characters (centerchar1, rightchar1 and leftchar1) obtained on the seventh to the eleventh lines, and if even one character is different, the line is identified as the line immediately above.
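The comparison described above can be sketched as follows. This is a minimal illustration and not the program of Fig. 8: the helper `sampleAt`, the repeated upward probing, the bound `MAX_STEPS`, the value chosen for PMOVE and the mock page built by `makeSampler` are all assumptions introduced here so that the logic can run without a browser.

```javascript
// Sketch only. `sampleAt(x, y)` is a hypothetical helper standing in for the
// TextRange calls of Fig. 8; it returns the characters at (x, y) and CMOVE
// characters before and after it, or null when no text is found there.
const PMOVE = 2;      // upward step in the y direction (assumed value)
const MAX_STEPS = 50; // safety bound on the search (assumption, not in the source)

function sameTriple(a, b) {
  return a.left === b.left && a.center === b.center && a.right === b.right;
}

// Returns a y coordinate lying on the line immediately above, or null.
function findLineAbove(sampleAt, x, y) {
  const base = sampleAt(x, y);
  if (base === null) return null;
  for (let step = 1; step <= MAX_STEPS; step++) {
    const probeY = y - step * PMOVE; // screen y grows downward
    if (probeY < 0) break;
    const probe = sampleAt(x, probeY);
    // One differing character is enough to conclude the line has changed.
    if (probe !== null && !sameTriple(base, probe)) return probeY;
  }
  return null;
}

// Mock page for demonstration: fixed-height lines and fixed-width characters.
function makeSampler(lines, lineHeight, charWidth, cmove) {
  return (x, y) => {
    const line = lines[Math.floor(y / lineHeight)];
    if (line === undefined) return null;
    const col = Math.floor(x / charWidth);
    return {
      left: line[col - cmove] ?? "",
      center: line[col] ?? "",
      right: line[col + cmove] ?? "",
    };
  };
}

const sampler = makeSampler(["first line of text", "second line here"], 16, 8, 1);
console.log(findLineAbove(sampler, 20, 20));
```

Note that a comparison of this kind cannot detect the line above when the three sampled characters happen to be identical on both lines.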

Thereafter, on the twenty-sixth line, the selection range of the "TextRange" object is expanded until it is equal to a sentence or a line unit, and on the twenty-seventh line, the text on the pertinent line is extracted. The method employed is the same as that explained for the link pointing operation, where text is extracted as a sentence or a line unit from the line the mouse pointer overlaps or from the line immediately above.

An explanation will now be given for a case where the specific operation that is performed is the vertical tracing and reading operation.

For the vertical tracing and reading operation, the mouse pointer is pointed at lines of text that are being read as it is gradually moved down, little by little, in the direction perpendicular to the lines. Thus, each movement of the mouse during this operation is performed very slowly, and spans only a short distance. The vertical tracing and reading operation is detected by using the coordinates (x, y) of the mouse pointer that are obtained from the mousemove event, and the occurrence time for the event. Various methods have been proposed for the detection of the vertical tracing and reading operation, but the following method is employed for this embodiment.

First, the sequential vertical travel of the mouse is detected. When the distance of the sequential vertical travel is equal to or greater than a threshold value, it is assumed that this movement is being used for the vertical tracing and reading operation. This is because the possibility that the vertical and linear travel of the mouse is accidental can thereby be eliminated. The sequential vertical movement of the mouse can be detected by determining whether the following conditions are established each time the mousemove event occurs.

First, the displacement distance between the coordinates of the mouse pointer for the last event and the coordinates of the mouse pointer for the preceding event is employed to determine whether the mouse is moving vertically in a window. Since each movement of the mouse during the vertical tracing and reading operation is performed very slowly and spans only a short distance, the displacement distance for the coordinates of the mouse pointer, instead of the inclination, is employed to determine whether the operation is being performed.

Second, the difference in the occurrence time between the last event and the preceding event is employed to determine whether the movement of the mouse has been discontinued.



When both conditions are established, that is, when it is ascertained that the mouse is being moved vertically and that its movement has not been discontinued, it is assumed that the mouse is traveling vertically and sequentially. But when either of the two conditions cannot be established, it is assumed that the vertical and sequential movement has been terminated.

Based on the above described premise, an explanation will now be given for the processing performed by the operation event analyzer 20 to detect the vertical tracing and reading operation. In the following explanation, parameters Xr and Yr are threshold values for the displacement distances during the travel of the mouse, and are used to determine whether the direction in which the mouse is moving should be regarded as vertical. Parameter Tr is a threshold value for the stop time that is used to determine whether the movement of the mouse is continuous. Parameter L is a threshold value for the distance travelled that is used to determine whether the sequential vertical movement that is detected is a vertical tracing and reading operation. The coordinates are represented by orthogonal x-y coordinates, with the x direction being defined as the horizontal direction on the display screen (i.e., the direction parallel to the lines) and the y direction being defined as the vertical direction on the display screen (i.e., the direction perpendicular to the lines).

Each time a "mousemove" event occurs, the operation event analyzer 20 calculates the difference (x_i - x_{i-1}, y_i - y_{i-1}) between the coordinates (x_i, y_i) of the mouse pointer and the coordinates (x_{i-1}, y_{i-1}) of the mouse pointer for the preceding "mousemove" event. When 0 < y_i - y_{i-1} < Yr is established, the absolute value d of the difference in the x direction is calculated using the following equation.

d = |x_i - x_{i-1}|

Further, the time interval td between the occurrence time t_i for the last event and the occurrence time t_{i-1} for the preceding event is calculated using the following equation.

td = t_i - t_{i-1}

In accordance with the values obtained for d and td, one of the following four process types is performed.

(1) The flag rflag that represents the sequential vertical movement is OFF, and d < Xr and td < Tr are established (the displacement in the x direction and the time interval from the preceding event both fall within the threshold values). In this case, it is assumed that the vertical and sequential movement of the mouse has begun, the flag rflag is set to ON, and the coordinates (x_i, y_i) of the mouse pointer are stored.

(2) The flag rflag is OFF, and d ≥ Xr or td ≥ Tr is established (at least one of the displacement in the x direction and the time interval from the preceding event exceeds its threshold value). In this case, it is assumed that the mouse is not travelling vertically and sequentially.

(3) The flag rflag is ON, and d < Xr and td < Tr are established. In this case, it is assumed that the mouse is moving vertically and sequentially, and the coordinates (x_i, y_i) of the mouse pointer are stored.

(4) The flag rflag is ON, and d ≥ Xr or td ≥ Tr is established. In this case, it is assumed that the vertical and sequential movement of the mouse has terminated, and the flag rflag is set to OFF. The stored coordinates of the mouse pointer, which were obtained while the mouse was moving vertically and sequentially, are employed to calculate the y coordinates at the start point and the end point of the movement, the average of the x coordinates obtained during the movement, and the distance l of the movement. If l < L, the distance of the extracted movement is less than the threshold value L, and this movement is not determined to be a vertical tracing and reading operation. But if l ≥ L, the movement is determined to be a vertical tracing and reading operation.
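The four process types can be sketched as a small state machine. This is an illustrative reading of the description, not the embodiment's program. The function name `createVerticalTraceDetector`, the shape of the returned object, the treatment of a step outside 0 < y_i - y_{i-1} < Yr as a failed condition, and the use of the end-to-start difference in y as the distance l are assumptions made here.

```javascript
// Sketch of the vertical tracing detector; Xr, Yr, Tr and L are the
// thresholds named in the text, everything else is a hypothetical choice.
function createVerticalTraceDetector({ Xr, Yr, Tr, L }) {
  let rflag = false; // ON while sequential vertical movement is in progress
  let stored = [];   // pointer coordinates stored during the movement
  let prev = null;   // sample from the preceding mousemove event

  // Feed one mousemove sample; returns a detection result or null.
  return function onMouseMove(x, y, t) {
    let result = null;
    if (prev !== null) {
      const dy = y - prev.y;
      const d = Math.abs(x - prev.x);
      const td = t - prev.t;
      // Assumption: a step outside 0 < dy < Yr counts as a failed condition.
      const continues = dy > 0 && dy < Yr && d < Xr && td < Tr;
      if (continues) {
        rflag = true;          // cases (1) and (3): start or continue
        stored.push({ x, y });
      } else if (rflag) {      // case (4): the movement has terminated
        rflag = false;
        const startY = stored[0].y;
        const endY = stored[stored.length - 1].y;
        const avgX = stored.reduce((sum, p) => sum + p.x, 0) / stored.length;
        const distance = endY - startY; // assumed definition of l
        stored = [];
        if (distance >= L) result = { startY, endY, avgX, distance };
      }
      // case (2): rflag is OFF and a condition failed, so nothing is done
    }
    prev = { x, y, t };
    return result;
  };
}

// Usage: slow downward movement followed by a sideways jump.
const detect = createVerticalTraceDetector({ Xr: 5, Yr: 10, Tr: 500, L: 20 });
const samples = [
  [100, 100, 0], [101, 105, 100], [100, 110, 200], [102, 116, 300],
  [101, 122, 400], [100, 128, 500], [200, 128, 600],
];
const detections = samples.map(([x, y, t]) => detect(x, y, t)).filter(Boolean);
console.log(detections);
```

For vertically written text, the same sketch would apply with the roles of the x and y coordinates exchanged.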

When a vertical tracing and reading operation is detected in the above described manner, the operation event analyzer 20 notifies the text extractor 30 that the tracing and reading operation is being performed, and also transmits to the text extractor 30 the coordinates (position information) of the mouse pointer at the tracing and reading start point and end point that were obtained for the mousemove events.

Upon receipt of the notification that a vertical tracing and reading operation has been performed, the text extractor 30 extracts text at the place where the vertical tracing and reading operation was performed. In this case, the text on a line the mouse pointer overlapped during the tracing and reading operation, and the text on the line immediately above, are extracted as sentence or line units. The text may instead be extracted from only one of these, either the line which was overlapped by the mouse pointer or the line immediately above.

Since the text extraction processing by the text extractor 30 is performed in the same manner as for the tracing and reading operation described above, no further explanation for it will be given.

In the above description, the operation for moving the mouse pointer horizontally along the lines of the text is called tracing and reading, and the operation for using the mouse pointer to point at the current line in the text and for slowly shifting the mouse pointer down, little by little, in the direction perpendicular to the lines is called vertical tracing and reading. This is because it is assumed that the text is written horizontally. When the text is written vertically, however, vertical reading performed along the lines corresponds to the tracing and reading operation, and horizontal reading performed perpendicular to the lines corresponds to the vertical tracing and reading operation.

The information extraction system in this embodiment is connected to a network, such as the Internet, and functions as an information processing apparatus on which a web browser is mounted. That is, the web contents that the information processing apparatus receives from a web server are displayed by the web browser, and each of the above described operations that a user unintentionally performs while browsing the displayed web contents is determined to be an operating event, and the target text for the detected operation is extracted.

DOCKET NUMBERS JA919990254US1

The various modes that follow can be employed as means for providing, for the information processing apparatus, the function of the information extraction system for the embodiment. Typical modes will now be described while referring to Figs. 9 to 12.

In the mode shown in Fig. 9, the operation event detector 10, the operation event analyzer 20 and the text extractor 30 are written in a script language, such as JavaScript, and are embedded in advance in web contents 101 that are stored in a web server 100. With this arrangement, when an information processing apparatus 110 receives the web contents 101 from the web server 100, a web browser 111, based on script 102 that is embedded in the web contents 101, performs a process for detecting an operating event, a process for analyzing the operating event string and for detecting one of the above described specific operations, such as the selection of a character string, the pointing to a link or the tracing and reading, and a process for extracting, for the pertinent operation, a target character string that is thereafter transmitted to the web server 100. The function for returning the extracted text to the web server 100 may be provided by embedding it, together with the operation event detector 10, the operation event analyzer 20 and the text extractor 30, as a script in the web contents 101, or it may be distributed together with the web contents 101 as a Java applet to the information processing apparatus 110.

Since the thus obtained text can be regarded as information that the user has shown an interest in while browsing the web contents 101, the web server 100 can employ the text to provide various services, such as research performed to ascertain web audience ratings and the narrowing of search conditions for a search engine.

In the mode shown in Fig. 10, the web server 100 includes a writing processor 120 for writing the operation event detector 10, the operation event analyzer 20 and the text extractor 30 in the web contents 101 using a script language, such as JavaScript. In this mode, when a request to access the web contents 101 is issued by the information processing apparatus 110, the writing processor 120 of the web server 100 writes in the web contents 101 the script for carrying out the functions of the operation event detector 10, the operation event analyzer 20 and the text extractor 30. Then, the resultant web contents 101 are transmitted to the information processing apparatus 110.
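The writing step can be illustrated with a short sketch. It assumes the web contents are available as an HTML string and that the script is placed just before the closing body tag; the function name `embedScript` and that placement are choices made for this illustration and are not specified in the text.

```javascript
// Hypothetical sketch of a writing processor: inserts a <script> element
// into an HTML string before the contents are sent to the client.
// Note: scriptSource must not itself contain a closing script tag.
function embedScript(html, scriptSource) {
  const tag = `<script>\n${scriptSource}\n</script>`;
  const closingBody = /<\/body\s*>/i;
  if (!closingBody.test(html)) {
    return html + tag; // no closing body tag: append at the end
  }
  // A replacer function is used so that "$" sequences in the script
  // are not interpreted as replacement patterns.
  return html.replace(closingBody, (match) => tag + match);
}

const page = "<html><body><p>hello</p></body></html>";
console.log(embedScript(page, "/* detector, analyzer and extractor script */"));
```

The same function could serve a proxy server or a local proxy, since each of them performs the same writing step on contents received from the web server.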

Based on the script that is embedded in the received web contents 101, the web browser 111 of the information processing apparatus 110 performs the process for detecting an operating event, the process for analyzing the operating event string and for detecting a specific operation, such as the selection of a character string, the pointing to a link or the tracing and reading, and the process for extracting, for the pertinent operation, a target character string that is thereafter transmitted to the web server 100. The function for transmitting the extracted text to the web server 100 may be provided by embedding it, together with the operation event detector 10, the operation event analyzer 20 and the text extractor 30, in the web contents 101 as a script, or it may be distributed together with the web contents 101 in the form of a Java applet to the information processing apparatus 110.

Since the thus obtained text can be regarded as information that the user has shown an interest in while browsing the web contents 101, the web server 100 can employ the text to provide various services, such as research performed to ascertain web audience ratings and the narrowing of search conditions for a search engine.

In the mode shown in Fig. 11, a proxy server 130 is located between the web server 100 and the information processing apparatus 110, and writes the operation event detector 10, the operation event analyzer 20 and the text extractor 30 in the web contents 101 using a script language, such as JavaScript. In this mode, when a request to access the web contents 101 is issued by the information processing apparatus 110, the proxy server 130 receives the web contents 101 from the web server 100 and writes in them the script for carrying out the functions of the operation event detector 10, the operation event analyzer 20 and the text extractor 30. It then transmits the resultant web contents 101 to the information processing apparatus 110.

Based on the script that is embedded in the received web contents 101, the web browser 111 of the information processing apparatus 110 performs the process for detecting an operating event, the process for analyzing the operating event string to detect a specific operation, such as the selection of a character string, the pointing to a link or the tracing and reading, and the process for extracting, for the pertinent operation, a target character string that is thereafter transmitted to the proxy server 130. The function for transmitting the extracted text to the proxy server 130 may be provided by embedding it, together with the operation event detector 10, the operation event analyzer 20 and the text extractor 30, as a script in the web contents 101, or it may be distributed together with the web contents 101 in the form of a Java applet to the information processing apparatus 110.

Since the thus obtained text can be regarded as information that the user has shown an interest in while browsing the web contents 101, the proxy server 130 can employ the text to provide various services, such as research performed to ascertain web audience ratings and the narrowing of search conditions for a search engine.

As a modification of the mode in Fig. 11, the proxy server 130 may refrain from embedding in the web contents 101 the script for carrying out the functions of the operation event analyzer 20 and the text extractor 30, so that the information processing apparatus 110 merely detects an operating event. In this case, the operation event analyzer 20 and the text extractor 30 are provided in the proxy server 130, and the operating event detected by the information processing apparatus 110 is transmitted to the proxy server 130. Then, the proxy server 130 performs the process for analyzing the operating event string to detect a specific operation, such as the selection of a character string, the pointing to a link or the tracing and reading, and the process for extracting, for the pertinent operation, the target character string.

To transmit, to the proxy server 130, the operating event that is detected by the information processing apparatus 110, the script for transmitting the operating event may be embedded in the web contents 101 before the proxy server 130 transmits them to the information processing apparatus 110, or a request for the transmission of an operating event may be issued by the proxy server 130 to the information processing apparatus 110 so that the information processing apparatus 110 transmits the operating event to the proxy server 130. Furthermore, the proxy server 130 may hold the web contents 101 received from the web server 100 so that the text extractor 30 can extract the text from them, or the web contents 101 may be transmitted to the proxy server 130 by the information processing apparatus 110.

In the mode shown in Fig. 12, a portal site 140, which the information processing apparatus 110 accesses first when it is connected to the Internet, transmits a program file 150 to the information processing apparatus 110. This program file 150 implements a local proxy that writes the operation event detector 10, the operation event analyzer 20 and the text extractor 30 in the web contents 101 using a script language, such as JavaScript. In this mode, when the information processing apparatus 110 accesses the portal site 140, the program file 150 that is stored in a storage unit 141 at the portal site 140 is transmitted via a transmission/reception unit 142 to the information processing apparatus 110. The program file 150 is prepared, for example, as a Java applet.

The program file 150 that is transmitted by the portal site 140 to the information processing apparatus 110 serves as a local proxy 160 in the information processing apparatus 110. The local proxy 160 writes, in the web contents 101 received from the web server 100, a script for implementing the functions of the operation event detector 10, the operation event analyzer 20 and the text extractor 30, and transmits the resultant web contents 101 to the web browser 111.

Based on the script embedded in the web contents 101 that are received from the local proxy 160, the web browser 111 performs the process for detecting an operating event, the process for analyzing the operating event string to detect a specific operation, such as the selection of a character string, the pointing to a link or the tracing and reading, and the process for extracting, for the pertinent operation, a target character string that is thereafter transmitted to the portal site 140. The function for transmitting the extracted text to the portal site 140 may be provided by embedding it, together with the operation event detector 10, the operation event analyzer 20 and the text extractor 30, as a script in the web contents 101, or it may be provided as a function of the local proxy 160. Alternatively, the transmission/reception unit 142 of the portal site 140 may issue a request, to the information processing apparatus 110, for the transmission of the extracted text, which it thereafter collects.

Since the thus obtained text can be regarded as information that the user has shown an interest in while browsing the web contents 101, the portal site 140 can employ the text to provide various services, such as research performed to ascertain web audience ratings and the narrowing of search conditions for a search engine.

Fig. 13 is a diagram showing a comparison of the embodiment with the prior art in the process for employing obtained text to generate a keyword vector (a selected keyword and the weighting that represents its importance level) for a search engine. Using the conventional method, keywords included in the overall page are weighted using, for example, the TF·IDF method, and an important keyword is extracted. In this embodiment, however, the keyword weighting process is performed for the text that is a target for a user's operation. For the weighting of keywords, a conventional method, such as the IDF method in the TF·IDF method, can be employed. The keyword vector that is generated based on the text obtained in this embodiment can be employed by itself for services, such as research performed to ascertain web audience ratings and the narrowing of search conditions for a search engine. Further, as is shown in Fig. 13, the keyword vector can also be employed together with a keyword vector that is conventionally generated.
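As a rough illustration of the weighting described above, the following sketch builds a keyword vector from operated text and merges it with a page-level vector. The tokenizer, the smoothed IDF formula, the document-frequency table and the mixing weight `alpha` are all assumptions made for this example; the text does not specify how the two vectors are combined.

```javascript
// Sketch only: a simple tokenizer for ASCII text (assumption).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// documentFrequency: Map of term -> number of corpus documents containing it.
// Returns a Map of term -> weight (term frequency times a smoothed IDF).
function keywordVector(text, documentFrequency, corpusSize) {
  const tf = new Map();
  for (const term of tokenize(text)) {
    tf.set(term, (tf.get(term) || 0) + 1);
  }
  const vector = new Map();
  for (const [term, count] of tf) {
    const df = documentFrequency.get(term) || 0;
    const idf = Math.log((corpusSize + 1) / (df + 1)); // smoothed variant (assumed)
    vector.set(term, count * idf);
  }
  return vector;
}

// Hypothetical merge of the operated-text vector with a conventional
// page-level vector, using a linear mixing weight alpha in [0, 1].
function combineVectors(operated, page, alpha) {
  const out = new Map();
  for (const [term, w] of page) out.set(term, (1 - alpha) * w);
  for (const [term, w] of operated) out.set(term, (out.get(term) || 0) + alpha * w);
  return out;
}

const df = new Map([["the", 99], ["camera", 9]]);
const operatedVector = keywordVector("The camera, the lens", df, 99);
console.log([...operatedVector.entries()]);
```

Because the operated text is usually much shorter than the page, rare terms in it receive comparatively high weights, which is the effect the embodiment relies on.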

In the four modes, the transmission destination of the extracted text is not limited to those described above, and the extracted text can be transmitted to various users who are permitted to use it. For example, in the mode in Fig. 9, the extracted text may be transmitted to the creator of the web contents 101 in which the script 102 is embedded. Furthermore, in the modes in Figs. 11 and 12, the extracted text may be transmitted to a server that is provided separately from the proxy server 130 or the portal site 140, and that uses the extracted text to provide a service.

In this embodiment, text is extracted from web contents based on a user's operation. However, text may be extracted from document data having another arbitrary form. In this case, an area for monitoring the movement of a mouse may be set up not only on a screen on which web contents are displayed by a web browser, but also in various other areas, such as the entire screen of a display device of a computer or an area in a window that is displayed by an application program.

In addition, based on a user's operation performed for an object, such as an image other than text, the information for the target object can be extracted. In this case, the operation that is defined as one the user unintentionally performs for an interesting object is the selection of an object, which is performed in the same manner as is the selection of text, the pointing to a line, or clicking.

Moreover, input means other than the mouse or another pointing device may be employed to define the operation that a user unintentionally performs for an interesting object. A specific operation can be defined in accordance with, for example, the manipulation of a cursor key, voice input when the user reads text on a display, or the movement of a user's eyes.

Advantages of the Invention

As is described above, according to the present invention, while informative input by a user is not required, detailed information concerning a web content portion that the user is interested in can be obtained. Further, a detailed record of a user's operation, including the manipulation of objects on a web browser display, can be extracted, and can be used as information indicating the trend of the user's interest.

The present invention can be realized in hardware, software, or a combination of hardware and software. The present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system, or other apparatus adapted for carrying out the methods described herein, is suitable. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system, is able to carry out these methods.

Computer program means or computer program in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after conversion to another language, code or notation and/or reproduction in a different material form.

It is noted that the foregoing has outlined some of the more pertinent objects and embodiments of the present invention. This invention may be used for many applications. Thus, although the description is made for particular arrangements and methods, the intent and concept of the invention is suitable and applicable to other arrangements and applications. It will be clear to those skilled in the art that other modifications to the disclosed embodiments can be effected without departing from the spirit and scope of the invention. The described embodiments ought to be construed to be merely illustrative of some of the more prominent features and applications of the invention. Other beneficial results can be realized by applying the disclosed invention in a different manner or modifying the invention in ways known to those familiar with the art.